Gradio and LLM

Integrate the application into Gradio

Having successfully created a Q&A bot with your script, you might notice that responses are only displayed in the terminal. You may wonder if it's possible to integrate this application with Gradio to leverage a web interface for inputting questions and receiving responses.

The following code guides you through this integration process. It includes three components:

  • Initializing the model
  • Defining the function that generates responses from the LLM
  • Constructing the Gradio interface, enabling interaction with the LLM
    • The `gr.Textbox` element creates a text field that holds the user's input query and the LLM's output.
```python
# Import necessary packages
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai import Credentials
from langchain_ibm import WatsonxLLM
import gradio as gr

# Model and project settings
model_id = 'mistralai/mixtral-8x7b-instruct-v01'  # Directly specifying the model

# Set necessary parameters
parameters = {
    GenParams.MAX_NEW_TOKENS: 256,  # Maximum number of tokens to generate
    GenParams.TEMPERATURE: 0.5,     # Controls the randomness or creativity of the model's responses
}

project_id = "skills-network"

# Wrap the model in a WatsonxLLM inference object
watsonx_llm = WatsonxLLM(
    model_id=model_id,
    url="https://us-south.ml.cloud.ibm.com",
    project_id=project_id,
    params=parameters,
)

# Function to generate a response from the model
def generate_response(prompt_txt):
    generated_response = watsonx_llm.invoke(prompt_txt)
    return generated_response

# Create the Gradio interface
chat_application = gr.Interface(
    fn=generate_response,
    allow_flagging="never",
    inputs=gr.Textbox(label="Input", lines=2, placeholder="Type your question here..."),
    outputs=gr.Textbox(label="Output"),
    title="Watsonx.ai Chatbot",
    description="Ask any question and the chatbot will try to answer.",
)

# Launch the app
chat_application.launch(server_name="127.0.0.1", server_port=7860)
```
  1. Navigate to the PROJECT directory, right-click, and create a new file named `llm_chat.py`.
  2. Paste the script above into this new file.
  3. Open your terminal and ensure you are within the `my_env` virtual environment.
  4. Run the following command in the terminal to start the application.
```shell
python3.11 llm_chat.py
```

After it has executed successfully, you will see a message similar to the following in the terminal:

*(Screenshot: terminal output showing the running Gradio server)*

  1. Click the following button (or open http://127.0.0.1:7860 in your browser) to launch and view the application.

The chatbot you created will be displayed, as shown below:

*(Screenshot: the chatbot interface)*

Now, feel free to ask any question to the chatbot.

Here is an example of a question and the chatbot's answer:

*(Screenshot: example question and response)*

(To terminate the script, press Ctrl+C in the terminal and close the application window.)

Exercise

You might observe that the responses from the LLM are occasionally incomplete. Could you identify the cause of this issue? Also, would you be able to modify the code to enable the model to generate more content?

All you need to do is adjust one parameter:

Click here for the answer

```python
"max_new_tokens": 512,  # adjust the parameter `max_new_tokens` to a bigger value
```
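Applied to the script above, the updated `parameters` dictionary would look like the following sketch (512 is just one reasonable value; any larger token budget reduces truncation at the cost of longer generation time):

```python
parameters = {
    GenParams.MAX_NEW_TOKENS: 512,  # raised from 256 so responses are less likely to be cut off
    GenParams.TEMPERATURE: 0.5,     # unchanged
}
```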

In the following section, we will explore how to develop a resume polisher using the knowledge we have just acquired.